13. Minimizing Error Functions

INSTRUCTOR NOTE: From 2:22 onward, the slide title should say "Mean Absolute Error".

Development of the derivative of the error function

Notice that we've defined the squared error to be

$$Error = \frac{1}{2} (y - \hat{y})^2.$$

Also, we've defined the prediction to be

$$\hat{y} = w_1 x + w_2.$$
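As a concrete illustration of these two definitions, here is a minimal Python sketch. The function names `predict` and `squared_error` and the numeric values are assumptions chosen for this example only; they do not come from the lesson.

```python
def predict(w1, w2, x):
    """Linear prediction: y_hat = w1 * x + w2."""
    return w1 * x + w2


def squared_error(y, y_hat):
    """Squared error with the 1/2 factor: 0.5 * (y - y_hat)^2."""
    return 0.5 * (y - y_hat) ** 2


# Hypothetical weights and a single hypothetical data point.
w1, w2 = 2.0, 1.0
x, y = 3.0, 10.0

y_hat = predict(w1, w2, x)        # 2 * 3 + 1 = 7
error = squared_error(y, y_hat)   # 0.5 * (10 - 7)^2 = 4.5
print(y_hat, error)
```

The factor of 1/2 does not change where the error is smallest; it only cancels the 2 that appears when the square is differentiated.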

So to calculate the derivative of the Error with respect to $w_1$, we simply use the chain rule:

$$\frac{\partial}{\partial w_1} Error = \frac{\partial Error}{\partial \hat{y}} \frac{\partial \hat{y}}{\partial w_1}.$$

The first factor on the right-hand side is the derivative of the Error with respect to the prediction $\hat{y}$, which is $-(y - \hat{y})$.

The second factor is the derivative of the prediction with respect to $w_1$, which is simply $x$.

Therefore, the derivative is

$$\frac{\partial}{\partial w_1} Error = -(y - \hat{y}) x.$$
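The product of the two factors, $-(y - \hat{y}) x$, can be sanity-checked numerically. The sketch below compares it against a central finite-difference estimate of the same derivative. The function names, the step size `h`, and the sample values are all assumptions made for this illustration.

```python
def error(w1, w2, x, y):
    """Squared error 0.5 * (y - y_hat)^2 with y_hat = w1 * x + w2."""
    y_hat = w1 * x + w2
    return 0.5 * (y - y_hat) ** 2


def grad_w1(w1, w2, x, y):
    """Chain-rule result: dError/dw1 = -(y - y_hat) * x."""
    y_hat = w1 * x + w2
    return -(y - y_hat) * x


# Hypothetical weights and data point.
w1, w2, x, y = 2.0, 1.0, 3.0, 10.0
h = 1e-6  # small step for the finite-difference estimate

# Central difference: (E(w1 + h) - E(w1 - h)) / (2h)
numeric_w1 = (error(w1 + h, w2, x, y) - error(w1 - h, w2, x, y)) / (2 * h)
analytic_w1 = grad_w1(w1, w2, x, y)  # -(10 - 7) * 3 = -9

print(analytic_w1, numeric_w1)
```

The two values should agree up to floating-point rounding, which suggests the chain-rule expression is right for this point.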

Exercise

Calculate the derivative of the Error with respect to $w_2$ and verify that it is precisely $-(y - \hat{y})$.
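After working the exercise by hand, one way to check the answer is to estimate the derivative numerically. This sketch perturbs only $w_2$ and compares the result with $-(y - \hat{y})$. As before, the names, the step size, and the sample values are assumptions for illustration.

```python
def error(w1, w2, x, y):
    """Squared error 0.5 * (y - y_hat)^2 with y_hat = w1 * x + w2."""
    y_hat = w1 * x + w2
    return 0.5 * (y - y_hat) ** 2


# Hypothetical weights and data point.
w1, w2, x, y = 2.0, 1.0, 3.0, 10.0
h = 1e-6  # small step for the finite-difference estimate

# Central difference in w2 only; w1 is held fixed.
numeric_w2 = (error(w1, w2 + h, x, y) - error(w1, w2 - h, x, y)) / (2 * h)

# Claimed result of the exercise: -(y - y_hat).
y_hat = w1 * x + w2
claimed_w2 = -(y - y_hat)  # -(10 - 7) = -3

print(claimed_w2, numeric_w2)
```

Unlike the derivative with respect to $w_1$, no factor of $x$ appears here, because the derivative of $\hat{y} = w_1 x + w_2$ with respect to $w_2$ is 1.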